Fast boosting using adversarial bandits

نویسندگان

Róbert Busa-Fekete

Balázs Kégl

چکیده

In this paper we apply multi-armed bandits (MABs) to improve the computational complexity of AdaBoost. AdaBoost constructs a strong classifier in a stepwise fashion by selecting simple base classifiers and using their weighted “vote” to determine the final classification. We model this stepwise base classifier selection as a sequential decision problem, and optimize it with MABs where each arm represents a subset of the base classifier set. The MAB gradually learns the “usefulness” of the subsets, and selects one of the subsets in each iteration. AdaBoost then searches only this subset instead of optimizing the base classifier over the whole space. The main improvement of this paper over a previous approach is that we use an adversarial bandit algorithm instead of stochastic bandits. This choice allows us to prove a weak-to-strong-learning theorem, which means that the proposed technique remains a boosting algorithm in a formal sense. We demonstrate on benchmark datasets that our technique can achieve a generalization performance similar to standard AdaBoost for a computational cost that is an order of magnitude smaller.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits

We present an algorithm that achieves almost optimal pseudo-regret bounds against adversarial and stochastic bandits. Against adversarial bandits the pseudo-regret is O ( K √ n log n ) and against stochastic bandits the pseudo-regret is O ( ∑ i(log n)/∆i). We also show that no algorithm with O (log n) pseudo-regret against stochastic bandits can achieve Õ ( √ n) expected regret against adaptive...

متن کامل

The Best of Both Worlds: Stochastic and Adversarial Bandits

We present a new bandit algorithm, SAO (Stochastic and Adversarial Optimal) whose regret is (essentially) optimal both for adversarial rewards and for stochastic rewards. Specifically, SAO combines the O( √ n) worst-case regret of Exp3 (Auer et al., 2002b) and the (poly)logarithmic regret of UCB1 (Auer et al., 2002a) for stochastic rewards. Adversarial rewards and stochastic rewards are the two...

متن کامل

Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments

EXP3 is a popular algorithm for adversarial multiarmed bandits, suggested and analyzed in this setting by Auer et al. [2002b]. Recently there was an increased interest in the performance of this algorithm in the stochastic setting, due to its new applications to stochastic multiarmed bandits with side information [Seldin et al., 2011] and to multiarmed bandits in the mixed stochastic-adversaria...

متن کامل

An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits

We present a new strategy for gap estimation in randomized algorithms for multiarmed bandits and combine it with the EXP3++ algorithm of Seldin and Slivkins (2014). In the stochastic regime the strategy reduces dependence of regret on a time horizon from (ln t) to (ln t) and eliminates an additive factor of order ∆e 2 , where ∆ is the minimal gap of a problem instance. In the adversarial regime...

متن کامل

Reactive bandits with attitude

We consider a general class of K-armed bandits that adapt to the actions of the player. A single continuous parameter characterizes the “attitude” of the bandit, ranging from stochastic to cooperative or to fully adversarial in nature. The player seeks to maximize the expected return from the adaptive bandit, and the associated optimization problem is related to the free energy of a statistical...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2010

Fast boosting using adversarial bandits

نویسندگان

چکیده

منابع مشابه

An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits

The Best of Both Worlds: Stochastic and Adversarial Bandits

Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments

An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits

Reactive bandits with attitude

عنوان ژورنال:

اشتراک گذاری